REMARKS 

Claims 12-13, 21-24, and 26-32 are pending. Claim 25 has been cancelled, as the 
amendment to claim 24 makes claim 25 redundant. Support for amended claims 12 and 21-24
and new claims 26-32 derives from the specification and claims as originally filed. For
example, computational methods for the generation of primary libraries are described at page 
7, line 22, through page 8 line 12, and page 10, line 9 through page 15, line 14. Methods for 
the generation of secondary libraries from primary libraries are described at page 26, line 27, 
through page 30, line 27. Support for the synthesis of variant proteins, beginning with the
corresponding oligonucleotide sequences using multiple PCR, can be found at pages 31-32.
Methods for isolating, purifying, and expressing the oligonucleotide sequences as proteins are 
well known in the art, and are described at pages 41-47 and in the Examples. Accordingly, 
the amendments do not present new matter and entry is proper. 

Rejections under 35 U.S.C. § 112, first paragraph

Claims 12-13 and 21-25 are rejected under 35 U.S.C. § 112, first paragraph, for failing
to comply with the written description requirement. In rejecting claims 12-13 and 21-25, the
Examiner's position appears to be that: (1) the specification does not describe how a library
of primary sequences can be received, and (2) the specification is enabling only for the
design of enzymes (see page 7 of the final office action).

In response to the Examiner's first point, Applicants have amended claim 12 to clarify 
that the primary library of primary variant sequences is provided via a computational 
processing method based on force field calculations using the coordinates of a target protein. 
Applicants respectfully submit that this amendment overcomes the Examiner's first point. 

In response to the Examiner's second point, Applicants respectfully submit that the
specification enables a method for computationally generating a genus of secondary libraries
comprising variant sequences in which the starting protein structure (i.e., scaffold) can be any
protein for which a three-dimensional structure is known or can be generated. In addressing
the written description requirement under 35 U.S.C. § 112, the Federal Circuit in University
of California v. Eli Lilly and Co., 43 USPQ2d 1398, 1406 (Fed. Cir. 1997), stated:

A description of a genus of cDNAs may be achieved by means
of a recitation of a representative number of cDNAs, defined by
nucleotide sequence, falling within the scope of the genus or of a
recitation of structural features common to the members of the
genus, which features constitute a substantial portion of the
genus. This is analogous to enablement of a genus under
Section 112, para. 1, by showing the enablement of a
representative number of species within the genus. See
Angstadt, 537 F.2d at 502-03 (deciding that applicants "are not
required to disclose every species encompassed by their claims
even in an unpredictable art" and that the disclosure
of forty working examples sufficiently described subject matter
of claims directed to a generic process) . . . See also In re
Grimme, 274 F.2d 949, 952 ("[I]t has been consistently held
that the naming of one member of such a group is not, in itself,
a proper basis for a claim to the entire group. However, it may
not be necessary to enumerate a plurality of species if a genus
is sufficiently identified in an application by other appropriate
language.").

Applicants respectfully submit that the specification provides written description 
support and enables a method for computationally generating a genus of secondary libraries 
comprising variant sequences in which the starting protein structure (i.e., scaffold) can be any
protein for which a three-dimensional structure is known or can be generated. Proteins
suitable as starting structures for the generation of secondary libraries are outlined on page 9, 
line 8 through page 10, line 8. 

Regarding the issue that "any scaffold protein" is too broad, it is clear from the
description provided in the specification, and is well known in the art, that a scaffold protein is
one that must already have a known structure (e.g., 3D), and is therefore a required input to
the method and not a required part of the method. The structure of said scaffold protein may be
obtained using well-known structural biology techniques such as X-ray crystallography and NMR
spectroscopy, the results of which are commonly made publicly available in the Protein Data Bank.
Alternatively, a homology model may be generated using one or more structures that are 
publicly available and algorithms well known in the art. The current invention does not 
require the user to "generate a scaffold for a single, let alone, for a library of any organisms", 
as is stated. Rather, the current invention provides methods of protein design, which can be 
applied to an existing scaffold protein and/or set of coordinates. In addition, the removal of
water, SO2, and other extraneous non-protein artifacts from an available structure is common 
practice in the field, and is carried out simply by editing the structure file. Such editing is 
neither a restriction for the application of the current method nor does it represent undue 
experimentation. 
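
By way of illustration only, the kind of structure-file edit referred to above can be sketched in a few lines of Python. The sketch assumes a PDB-format text file in which waters, ions, and ligands appear as HETATM records. The function name strip_hetero_records and the example records, including their coordinates, are hypothetical and are not taken from the specification. Modified residues are also stored as HETATM records, so a user may wish to retain some of them in practice.

```python
def strip_hetero_records(pdb_text: str) -> str:
    """Return the PDB text with every HETATM record removed.

    Assumption: non-protein atoms (water, ions, ligands) are stored as
    HETATM records, as in the standard PDB text format.
    """
    kept = [line for line in pdb_text.splitlines()
            if not line.startswith("HETATM")]
    return "\n".join(kept)


# Hypothetical five-line structure file: two protein atoms, one water,
# one sulfate atom, and the END record. Coordinates are made up.
EXAMPLE = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
    "HETATM  603  O   HOH A 101      18.250  12.310   5.120  1.00 20.00           O",
    "HETATM  604  S   SO4 A 201      10.000  11.000  12.000  1.00 30.00           S",
    "END",
])

cleaned = strip_hetero_records(EXAMPLE)
print(cleaned)
```

Only the two ATOM records and the END record remain in the output; the protein coordinates themselves are not altered.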

The computational methods that can be used in generating the secondary libraries are
known to those of skill in the art. Finally, two working examples for the generation of
variant libraries, using a combination of the computational and experimental
methods, are described in the specification as filed.

In support of the position that the present invention is enabled, Applicants enclose 
herewith a number of publications that are both prior and subsequent to the filing date of the 
present application, that address the enablement of the invention. These are not offered to 
augment the disclosure of the application; rather, the work is presented to show that the present
invention is enabled for any protein for which a defined set of coordinates can be generated.
See In re Wilson, 135 USPQ 442, 444 (CCPA 1962); Ex parte Obukowicz, 27 USPQ2d 1063
(BPAI 1993); Gould v. Quigg, 3 USPQ2d 1302, 1305 (Fed. Cir. 1987):

"it is true that a later dated publication cannot supplement an 
insufficient disclosure in a prior dated application to render it 
enabling. In this case the later dated publication was not 
offered as evidence for this purpose. Rather, it was offered . . . 
as evidence that the disclosed device would have been
operative."

In the article "Proteins from Scratch" (DeGrado, Science 278:80-81, 1997, a copy of 
which is enclosed as Exhibit A), biochemistry professor William F. DeGrado of the 
University of Pennsylvania School of Medicine, a world-renowned expert in protein 
structure, folding and design, comments on the computational platform designed by Dahiyat 
and Mayo in Science 278:82-87 (1997). This platform is an earlier version of the 
computational platform that has evolved and is claimed herein. Dr. DeGrado states: 

Not long ago, it seemed inconceivable that proteins could be
designed from scratch. Because each protein sequence has an
astronomical number of potential conformations, it appeared that
only an experimentalist with the evolutionary life span of
Mother Nature could design a sequence capable of folding into
a single, well-defined three-dimensional structure. But now, on
page 82 of this issue, Dahiyat and Mayo describe a new
approach that makes de novo protein design as easy as running
a computer program.

Dr. DeGrado further states (col 1, paragraph 3): 

Thus, the problem of de novo protein design reduced to two
steps: selecting a desired tertiary structure and finding a
sequence that would stabilize this fold. Dahiyat and Mayo
have now mastered the second step with spectacular success.
They have distilled the rules, insights, and paradigms gleaned
from two decades of experiments into a single computational
algorithm . . . Thus, the rules of . . . computational methods for de
novo design may now be sufficiently defined to allow the
engineering of a variety of proteins.

Holmes (New Scientist, 11 October 1997, enclosed herein as Exhibit B), quoting Dr.
Wells, a protein engineer at Genentech, states, "This [the Dahiyat and Mayo work] will stand 
as a landmark piece of work". 

In an article by Borman (Chemical & Engineering News, October 6, 1997,
enclosed as Exhibit C) George D. Rose, formerly professor of biophysics and biophysical 
chemistry at the Johns Hopkins University refers to the work of Dahiyat and Mayo stating, 
"Dahiyat and Mayo have taken protein engineering to a new high". 

On page 6, the Examiner states that "[a]s a skilled in the art appreciates, to date there 
are too numerous obstacles for the design of even a single secondary structure of a protein, let 
alone, all or any kinds of proteins".

Professor DeGrado, Dr. Wells and Professor Rose all highlight the concept of the 
Dahiyat and Mayo computational protein design as a significant breakthrough in science. In 
addition, since the computational platform uses scaffold proteins (by definition protein 
structures), there is no discrimination as to the particular "type" of protein, be it enzyme or 
otherwise. As the Examiner will appreciate, despite the fact that proteins display an 
enormous array of functions, all are naturally composed of the same 20 amino acids, all 
governed by the same physico-chemical principles. 

Further, Applicants have designed many proteins that are not "enzymes". For
example, see the enclosed patents and articles describing computationally designed G-CSF (US 6627186
and Luo P et al., Protein Science 11, 1218-1226 (2002); enclosed herein as Exhibits D and E),
Interferon Beta (US 6514729, enclosed herein as Exhibit F) and TNF-alpha (US publication
No. 2003/138401 and Steed PM et al., Science 301, 1895-1898 (2003); enclosed herein as
Exhibits G and H). These non-enzymatic proteins have a variety of structures
and have all been successfully designed. Thus, it is improper to limit the scope of this 
invention to just "enzymes".

The articles, patents and patent applications discussed above, and in particular, the
commentary of Professor DeGrado, support the enablement of the methods disclosed in the 
pending claims. Importantly, the methods apply to proteins in general, regardless of whether 
the protein is an enzyme, as described in the example, or an antibody, cell surface receptor, or 
other protein of interest. 

Accordingly, Applicants respectfully submit that the specification fully enables the 
present claims, and respectfully request withdrawal of the rejection under 35 U.S.C. § 112,
first paragraph. 

Rejections under 35 U.S.C. § 112, second paragraph 

Claims 12-13 and 21-25 are rejected under 35 U.S.C. § 112, second paragraph, as
being indefinite. In rejecting claims 12-13 and 21-25, the Examiner refers back to the Office
Action mailed October 2, 2002. In reviewing the Office Action mailed October 2, 2002, 
Examiner's first point under paragraph A appears to be that as written, independent claim 12 
omits essential steps. 

As discussed above, Claim 12 has been amended and thus, the rejection should be 
withdrawn. 

The Examiner's other point under paragraph A appears to be that Applicants have
acquiesced to the rejection regarding the term "probability" by not responding thereto.
Applicants respectfully disagree. The term "probability" is used in connection with a
"probability distribution table", which is a term of art. The generation of the probability tables
used in the methods recited in claims 12-13 and 21-25, and the information they contain, are
described throughout the specification; see, e.g., page 23, line 19 through page 30, line 6, and
the Examples. In the Examples, a probability distribution table was generated by first
analyzing the primary library in a Monte Carlo analysis, which yielded the probability of
each possible amino acid at each position, and then placing those probabilities in a table
(Table 3), i.e., a probability table. Table 4 shows how this distribution can be rounded. Accordingly,
Applicants respectfully submit that the term "probability" when used in connection with
"probability distribution table" does not connote uncertainty.
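
As an informal illustration of what such a table contains, the following Python sketch tabulates, for each position, the fraction of sequences in a library that carry each amino acid. It assumes a library of equal-length sequences written in one-letter code. The four-sequence library and the function name probability_table are invented for this example and do not reproduce the data of Tables 3 or 4.

```python
from collections import Counter


def probability_table(primary_library):
    """Per-position amino acid frequencies for equal-length sequences."""
    length = len(primary_library[0])
    if any(len(seq) != length for seq in primary_library):
        raise ValueError("all sequences must have the same length")
    total = len(primary_library)
    table = []
    for pos in range(length):
        # Count how often each residue occurs at this position.
        counts = Counter(seq[pos] for seq in primary_library)
        table.append({aa: n / total for aa, n in counts.items()})
    return table


# Hypothetical primary library (not taken from the specification).
primary = ["ACDA", "ACEA", "GCDA", "ACDV"]
table = probability_table(primary)

for pos, row in enumerate(table, start=1):
    print(pos, row)
```

Each row is a deterministic summary of the library: the entries are observed frequencies that sum to one at every position.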

Accordingly, Applicants respectfully request withdrawal of the rejection under 35
U.S.C. § 112, second paragraph as set forth in paragraph A.

Applicants acknowledge the Examiner's statement that the rejections under
paragraphs B-E are either withdrawn or are moot. 

Under paragraph F, the Examiner's first point appears to be that claims 21-25 do not 
further limit independent claim 12 because claim 12 recites protein variants and claims 21-23 
relate to the synthesis of oligonucleotides. Claim 21 has been amended to clarify the
relation between claim 12 and claims 21-23.

The Examiner also rejects the use of the terms "relative amounts", "equimolar 
amounts" and "correspond". Claims 21-24 have been amended to clarify the metes and 
bounds of these amounts. Claim 25 has been cancelled. Accordingly, Applicants 
respectfully request withdrawal of the rejection under 35 U.S.C. § 112, second paragraph as
set forth in paragraph F. 

Rejections under 35 U.S.C. § 103(a) 

Claims 12-13 and 21-25 are rejected under 35 U.S.C. § 103(a) as being unpatentable 
over Mayo et al., WO 98/47089. This rejection is reiterated for the reasons set forth in the
Office Action mailed October 2, 2003. The Examiner's position appears to be that as the 
multiple PCR technique is known in the art, it would have been obvious to one of skill in the 
art to synthesize the sequences of Mayo using this technology. Applicants respectfully 
disagree. 

Independent claim 12 recites a computational method for making secondary
libraries from primary libraries comprising the steps a) providing a primary library 
comprising a plurality of primary variant sequences computationally generated using a force 
field calculation; b) computationally generating a probability distribution table of amino acid 
residues in a plurality of variant positions from said primary variant sequences; and, c) 
computationally combining a plurality of said amino acid residues to generate a secondary 
library of secondary variant sequences; wherein at least one of said secondary variant 
sequences is different from said primary variant sequences. 
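
Purely to illustrate how steps (a) through (c) fit together, the following Python sketch starts from a primary library, tabulates residue frequencies at each position, and recombines the residues that meet a frequency cutoff. Several assumptions are made: the primary library is supplied as a hand-written list rather than generated by a force field calculation, the cutoff of 0.2 is arbitrary, and the name secondary_library is hypothetical. It is a sketch of the general idea and not the implementation described in the specification.

```python
from collections import Counter
from itertools import product


def secondary_library(primary_library, cutoff=0.2):
    """Combine residues whose per-position frequency is >= cutoff."""
    total = len(primary_library)
    length = len(primary_library[0])
    allowed = []
    for pos in range(length):
        # Step (b): frequency of each residue at this variant position.
        counts = Counter(seq[pos] for seq in primary_library)
        allowed.append(sorted(aa for aa, n in counts.items()
                              if n / total >= cutoff))
    # Step (c): every combination of the retained residues.
    return ["".join(combo) for combo in product(*allowed)]


# Step (a): hypothetical primary library, standing in for sequences
# that would come from a force field calculation.
primary = ["ACDA", "ACEA", "GCDA", "ACDV"]
secondary = secondary_library(primary)

# Sequences present in the secondary library but absent from the primary one.
novel = sorted(set(secondary) - set(primary))
print(len(secondary), novel)
```

In this toy case the recombination yields eight sequences, four of which do not occur in the primary library, which corresponds to the final limitation of the claim.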

Applicants respectfully submit that Mayo does not disclose methods for 
computationally generating a secondary library from a primary library. Mayo does, as the 
Examiner points out, describe the use of rotamer libraries and force field calculations to 
generate sequences. However, these sequences are the equivalent of the primary sequence
libraries described in the current invention (e.g., Step (A) of claims 26-33). The rotamer
library does not constitute such a primary library as described. The rotamers of the Mayo 
invention and the current invention are defined before any calculation is done, and in fact 
they are the starting point for an analysis using force field calculations. 

To establish a prima facie case of obviousness the prior art reference (or references 
when combined) must teach or suggest all the claim limitations. The teaching or suggestion 
to make the claimed combination and the reasonable expectation of success must both be 
found in the prior art, and not based on applicant's disclosure. In re Vaeck, 947 F.2d 488, 20
USPQ2d 1438 (Fed. Cir. 1991); M.P.E.P. § 2143.

As argued above, Applicants respectfully submit that Mayo does not disclose methods
for computationally generating secondary libraries from primary libraries. Accordingly, 
Applicants respectfully request the rejection under 35 U.S.C. § 103(a) be withdrawn. 

The Examiner is invited to contact the undersigned at (415) 781-1989 if any issues 
may be resolved in that manner. 



Four Embarcadero Center 
Suite 3400 

San Francisco, California 94111-4187
Telephone: (415) 781-1989
Fax No. (415) 398-3249

Proteins from Scratch

William F. DeGrado 



Not long ago, it seemed inconceivable that proteins could be designed from scratch. Because
each protein sequence has an astronomical number of potential conformations, it appeared that
only an experimentalist with the evolutionary life span of Mother Nature could design a
sequence capable of folding into a single, well-defined three-dimensional structure. But now,
on page 82 of this issue, Dahiyat and Mayo (1) describe a new approach that makes de novo
protein design as easy as running a computer program. Well almost. . .

The intellectual roots of this new work go back to the early 1980s when protein engineers
first thought about designing proteins (2). At that point, the prediction of a protein's
three-dimensional structure from its sequence alone seemed a difficult proposition. However,
they opined that the inverse problem, designing an amino acid sequence capable of assuming a
desired three-dimensional structure, would be a more tractable problem, because one could
"over-engineer" the system to favor the desired folding pattern. Thus, the problem of de novo
protein design reduced to two steps: selecting a desired tertiary structure and finding a
sequence that would stabilize this fold. Dahiyat and Mayo have now mastered the second step
with spectacular success. They have distilled the rules, insights, and paradigms gleaned from
two decades of experiments (3) into a single computational algorithm that predicts an optimal
sequence for a given fold. Further, when put to the test the algorithm actually predicted a
sequence that folded into the desired three-dimensional structure. Thus, the rules of protein
folding and computational methods for de novo design may now be sufficiently defined to allow
the engineering of a variety of proteins.

Dahiyat and Mayo's program divides the interactions that stabilize protein structures into
three categories: interactions of side chains that are exposed to solvent, of side chains
buried in the protein interior, and of parts of the protein that occupy more interfacial
positions. Exposed residues contribute to stability, primarily through conformational
preferences and weakly attractive, solvent-exposed polar interactions (4). The burial of
hydrophobic residues in the well-packed interior of a protein provides an even more powerful
driving force for folding. The side chains in the interior of a protein adopt unique
conformations, the prediction of which is a large combinatorial problem.

Better than the real thing. The natural zinc finger protein Zif268 (left) is stabilized in
part by a core of hydrophobic (green) side chains and metal-chelating side chains (red). In
the designed protein FSD-1 (right), the Zif268 core is retained but the metal-chelating His
residues and one of the Cys residues of Zif268 are converted to hydrophobic Phe and Ala
residues, thereby extending the hydrophobic core. The fourth metal ligand (Cys) is converted
to a Lys residue. The apolar portion of this interfacial residue shields the hydrophobic core,
whereas its ammonium group is exposed to solvent. The helix is also stabilized by an N-capping
interaction (19), which presumably also stabilizes the structure.

One important simplifying assumption arose from the early work of Janin et al. (5), who
showed that each individual side chain can adopt a limited number of low-energy conformations
(named rotamers), reducing the number of probable conformers available to a protein. This work
was subsequently extended to the design of proteins containing only the most favorable
rotamers (6). Although the side chains in natural proteins deviate from ideality in a few
cases (complicating the prediction of the structures of natural proteins), these deviations
need not be considered in the design of idealized proteins. Thus, various algorithms have been
developed to examine all possible hydrophobic residues in all possible rotameric states, to
find combinations that efficiently fill the interior of a protein. A complementary approach
uses genetic methods to exhaustively search for sequences capable of filling a protein core
(7), and this work has been adapted for the de novo design of proteins (8).

Interfacial residues are also quite important for protein stability (9, 10). They are often
amphiphilic (for example, Lys, Arg, and Tyr) and their apolar atoms can cap the hydrophobic
core, while their polar groups engage in electrostatic and hydrogen-bonded interactions.

Until recently, protein designers have frequently concentrated on quantifying the energetics
associated with just one of these three types of interactions (3). However, de novo design is
best approached by simultaneously considering all of the side chains in the protein
(unfortunately, a very high-order combinatorial problem). For instance, the volume available
to the interior side chains depends on the nature and conformation of the residues at the
interfacial positions and vice versa. Dahiyat and Mayo assumed that each of these three
features had been adequately quantitated to provide a useful empirical energy function for
protein design. Their program combines a number of features taken from earlier potential
functions and includes a penalty for exposing hydrophobic groups to solvent. Another essential
innovation included in their program is an implementation of the Dead-End Elimination theorem,
to efficiently search through sequence and side chain rotamer space.

Dahiyat and Mayo's target fold is a zinc finger, a motif with a well-established history in
protein structure prediction and design. In an early, prescient paper, Berg correctly inferred
that this His2Cys2 Zn-binding motif must feature a β-β-α fold that would position the ligating
groups in a tetrahedral array around the bound Zn(II) (11). Favorable metal ion-ligand
interactions together with a small apolar core help stabilize the three-dimensional structure
of this compact fold. More recently, Imperiali and co-workers have designed a peptide that
folded into this motif, even in the absence of metal ions (12). The design included a D-amino
acid to stabilize a type II' turn, and a large, rigid tricyclic side chain that may help
consolidate the hydrophobic core. This work was particularly exciting because, before their
studies, it was not expected that sequences as short as 25 residues in length could fold into
stable tertiary structures.

Now, Dahiyat and Mayo take these studies one step further through the design of a sequence
composed of only natural amino acids that adopts the zinc finger motif. As input to their
program, they introduced the coordinates of the backbone atoms from the crystal structure of
the second domain of the zinc finger protein Zif268. The program then evaluated a total of
10^[illegible] possible side chain-rotamer combinations to find a sequence capable of
stabilizing this fold without a bound metal ion. The resulting protein sequence shares a small
hydrophobic core with its predecessor from Zif268. However, in the newly designed protein
FSD-1 the core is enlarged through the addition of hydrophobic residues that fill the space
vacated by the removal of the metal-binding site (see the figure). This increase in the size
of the hydrophobic core together with the enhancements in the propensity for forming the
appropriate secondary structure provide an adequate driving force for folding. The designed
miniprotein actually folds into the desired structure as assessed by nuclear magnetic
resonance spectroscopy, and the observed structure closely resembles the three-dimensional
structure of Zif268.

Because of its small size, the protein is marginally stable. A van't Hoff analysis of the
thermal unfolding curve gives a change in the enthalpy (ΔHvH) of approximately -10 kcal/mol,
and indicates that the protein is about 90 to 95% folded at low temperatures (13). The small
value of ΔHvH and the lack of strong cooperativity in the unfolding transition are expected
for a native-like protein of this very small size (14). Thus, FSD-1 is the smallest protein
known to be capable of folding into a unique structure without the thermodynamic assistance of
disulfides, metal ions, or other subunits. This important accomplishment illustrates the
impressive ability of Dahiyat and Mayo's program to design highly optimized sequences.

This new achievement caps a banner year for de novo protein design. Earlier, Regan (15)
answered the challenge of changing a protein's tertiary structure by altering no more than 50%
of its sequence. And although Dahiyat and Mayo have demonstrated that the stabilizing
metal-binding site is not necessary in their system, Caradonna, Hellinga, and co-workers (16)
have made impressive progress in automating the introduction of functional metal-binding sites
into the three-dimensional structures of natural proteins. Further, other workers (17) have
used less automated approaches to successfully introduce functionally and spectroscopically
interesting metal-binding sites into de novo designed proteins.



To date, the most computationally intensive protein design problems have been the redesign of
natural proteins of known three-dimensional structure. But the new automated approaches open
the door to the de novo design of structures with entirely novel backbone conformations. It
will be interesting to see if Dahiyat and Mayo's approach of designing an optimal sequence for
a given fold is sufficient, or if it will be necessary also to destabilize alternate possible
folds. Indeed, when using an earlier version of their algorithm to repack the interior of the
coiled coil from GCN4, they had to retain the identity of a buried Asn residue from the
wild-type protein. Although the inclusion of this Asn actually destabilized the desired fold,
it was nevertheless essential to avoid the formation of alternate, unwanted conformers (18).
The ability to ask such focused questions will reveal much about how natural proteins adopt
their folded conformations while simultaneously allowing the design of entirely new polymers
for applications ranging from catalysis to pharmaceuticals.

The bigger the better

Size is all-important for cells that want to get around 



TINY free-floating bacteria that may have been among the earliest life forms on Earth were
probably not equipped to move about. So says an American biophysicist who has calculated that
microbes below a certain size cannot derive any benefit from being able to steer themselves
around. The work adds a constraint to evolution that should apply anywhere in the Universe.

Exactly how small a cell can be is controversial. Although most biologists believe no
bacterial cells less than 0.2 micrometres long exist, some claim to have evidence for cells a
tenth of this size. The debate hotted up last year, after NASA scientists claimed to have
found such miniature microbes fossilised inside a Martian meteorite (This Week, 17 August
1996, p 4).

David Dusenbery of the Georgia Institute of Technology in Atlanta says he has come across
part of the answer by studying the physics of the microbial environment, rather than the
bacteria themselves. "How much information an organism can gather about its environment, and
how it then acts, clearly has physical constraints," he says.

Dusenbery found that on the nanometre scale, the ability of a bacterium to swim in a
particular direction can be fairly accurately modelled with just a handful of equations. These
define how the organisms would react in different conditions of light, temperature and
chemistry. It turns out that in almost every circumstance, free-floating cells below 0.6
micrometres simply do not benefit from swimming, no matter how powerful their "motors".

Nice mover: large bacteria have in-built "motors"

The reason, says Dusenbery, is that such organisms live in a physical world that defies our
everyday intuition. For example, on small scales, water is a very viscous fluid. If a very
small bacterium stopped swimming even momentarily, it would grind to a halt in a few atoms'
length. Furthermore, the random buffeting by water molecules would be more than enough to push
the cells off-course.

Dusenbery concludes that if motility can't do a small cell any good, it should not evolve in
such cells. To test that prediction, he compiled a list of 218 genera of free-floating
bacteria, about half motile and half non-motile. In general, motile bacteria were larger than
their less active kin, and none of the motile bacteria fell below the 0.6 micrometre cut-off
(Proceedings of the National Academy of Science, vol 94, p 10949). Dusenbery is confident that
the same will be true on any other planet: "As long as the laws of physics were the same, the
same rules apply."

"It's intriguing work; he seems to have gone into all the aspects of environment you can
imagine," comments Robert Macnab of Yale University, who studies the biology of bacterial
motion. But he adds that the protein motors that power modern bacteria are incredibly complex,
requiring more than 60 genes. That alone might explain why only larger, more complex bacteria
are capable of independent motion. Philip Cohen



First-ever designer protein fits like a glove 



A MOLECULAR tailor's shop has turned out the world's first made-to- 
order protein. The feat opens the door to a future in which scientists 
might design bespoke proteins for drugs or industrial catalysts. 

A protein is a long chain of amino acid building blocks, tightly folded 
on itself like a scrunched-up piece of string. Electrically charged parts 
remain exposed to surrounding water molecules, while neutral parts 
are tucked away inside. With interactions between many thousands 
of atoms, proteins are so complex that scientists have not been able 
to predict their folded shapes from their amino acid sequences. 

Now, after five years of studying the complex atomic interactions
in proteins, Stephen Mayo of the California Institute of Technology
in Pasadena has developed a computer program that selects the
sequence of amino acids required to yield a folded protein of a
specific shape. With his colleague Bassil Dahiyat, he has used
the program to create the first fully tailored protein.

The protein, containing 28 amino acids, could be put together in 
10^7 different ways. But the researchers used the program to predict 
the sequence that would give a shape that mimics an existing protein 
type called the "zinc finger". In last week's Science (vol 278, p 82),
the researchers say that when they synthesised the protein, it fitted
the zinc-finger shape almost perfectly, proof that protein design is
really possible. "This will stand as a landmark piece of work," says 
Jim Wells, a protein engineer at Genentech in San Francisco. 

Mayo hopes eventually to design proteins up to 150 amino acids 
long, the size of most natural proteins. This would enable biotechnolo- 
gists to design heat-stable enzymes for industry and make more 
effective drugs. "It would be an awesome achievement to design a 
protein that would do what you want it to," says Mayo. Bob Holmes 

Development of a cytokine analog with enhanced
stability using computational ultrahigh 
throughput screening 

Abstract

Granulocyte-colony stimulating factor (G-CSF) is used worldwide to prevent neutropenia caused by high- 
dose chemotherapy. It has limited stability, strict formulation and storage requirements, and because of poor 
oral absorption must be administered by injection (typically daily). Thus, there is significant interest in 
developing analogs with improved pharmacological properties. We used our ultrahigh throughput compu- 
tational screening method to improve the physicochemical characteristics of G-CSF. Improving these 
properties can make a molecule more robust, enhance its shelf life, or make it more amenable to alternate 
delivery systems and formulations. It can also affect clinically important features such as pharmacokinetics. 
Residues in the buried core were selected for optimization to minimize changes to the surface, thereby 
maintaining the active site and limiting the designed protein's potential for antigenicity. Using a structure 
that was homology modeled from bovine G-CSF, core designs of 25-34 residues were completed, corre-
sponding to 10^21-10^28 sequences screened. The optimal sequence from each design was selected for
biophysical characterization and experimental testing; each had 10-14 mutations. The designed proteins 
showed enhanced thermal stabilities of up to 13°C, displayed five- to 10-fold improvements in shelf life, and 
were biologically active in cell proliferation assays and in a neutropenic mouse model. Pharmacokinetic 
studies in monkeys showed that subcutaneous injection of the designed analogs results in greater systemic
exposure, probably attributable to improved absorption from the subcutaneous compartment. These results 
show that our computational method can be used to develop improved pharmaceuticals and illustrate its 
utility as a powerful protein design tool. 

Keywords: Protein design; computational screen; stability; cytokines; granulocyte-colony stimulating 
factor 



Many techniques have been used in the design of new and 
improved proteins. In vitro directed evolution methods such 
as phage display, DNA shuffling, and error-prone PCR are 
widely used. Rational design approaches continue to be ap- 
plied, and strategies that combine both are now being used. 



Reprint requests to: Bassil I. Dahiyat, Xencor, Inc., 111 W. Lemon
Avenue, Monrovia, California 91016, USA; e-mail: baz@xencor.com; fax:
(626) 256-3562.

Article and publication are at http://www.proteinscience.org/cgi/doi/
10.1110/ps.4580102.



Successful designs include enzymes (Chen and Arnold 
1991; Stemmer 1994; Zhao et al. 1998) and other proteins 
(Crameri et al. 1996), as well as therapeutically useful pro- 
teins such as hormones and cytokines (Lowman and Wells 
1993; Heikoop et al. 1997; Grossmann et al. 1998; Chang et 
al. 1999). The experimental techniques involve the genera- 
tion and screening of libraries of random protein sequences. 
However, the number of sequences that can be screened ex- 
perimentally is limited (about lO'"* for library panning and 10'' 
for high throughput screening). Libraries of this size allow for 
the simultaneous modification of only about 10 residues. 

Computational methods have also been used that perform
in silico screening of protein sequences (Hellinga and 
Richards 1994; Desjarlais and Handel 1995; Dahiyat and 
Mayo 1996, 1997a; Street and Mayo 1999; Jiang et al. 2000; 
Kraemer-Pecore et al. 2001; Pokala and Handel 2001). Ex- 
ploiting the efficiency and speed of computers, these meth- 
ods can randomly screen a vast number of sequences (up to 
10^80), allowing for the simultaneous consideration and
modification of more than 60 residues. Searching such large 
sequence spaces drastically improves the possibility of find- 
ing novel protein sequences with improved properties. 

Investigators have recently developed a computational 
screening method that finds the optimal sequence for a de-
fined three-dimensional structure, allowing all or part of the
sequence to change (Dahiyat and Mayo 1996). This method, 
termed Protein Design Automation (PDA), scores the fit of 
sequences to the three-dimensional structure using physical- 
chemical potential functions that model the energetic inter- 
actions of protein atoms, including steric, solvation, and 
electrostatic interactions. PDA couples these potential func- 
tions with a highly efficient search algorithm to accurately 
screen up to 10^80 sequences. Because the screening is per-
formed in silico, multiple simultaneous mutations can be
made, and novel sequences that are very different from wild 
type can be discovered. The method has been validated by 
numerous experimental tests and has resulted in the design 
of new proteins with improved stability and conformational 
specificity, and novel activity (Dahiyat and Mayo 1996, 
1997a; Malakauskas and Mayo 1998; Strop and Mayo 1999; 
Shimaoka et al. 2000; Bolon and Mayo 2001; Marshall and
Mayo 2001). 

PDA also has the advantage of being able to control the 
location and type of mutations. For example, the design can 
be limited to the hydrophobic core. Mutations in the core 
can produce significant improvements in protein stability 
but do not change binding epitopes on the surface of the 
molecule. Thus, the molecular surface can be kept identical 
to the native structure, retaining biological activity and lim- 
iting toxicity and antigenicity. This feature is particularly 
important in the design of therapeutic proteins. 

We wanted to take advantage of these features of PDA 
and explore its utility in the design of improved pharma- 
ceuticals. We therefore used PDA as an ultrahigh through- 
put screen for improved analogs of a therapeutic protein, 
granulocyte-colony stimulating factor (G-CSF). G-CSF is a 
hematopoietic growth factor of 174 residues that induces 
differentiation and proliferation of granulocyte-committed 
progenitor cells. It is used clinically to treat cancer patients 
and alleviate the neutropenia induced by high-dose chemo- 
therapy. G-CSF belongs to the class of long-chain four- 
helix bundle cytokines that bind asymmetrically to homodi- 
meric complexes of cell-surface receptors to initiate an in- 
tracellular signaling cascade. Their structural similarity 
allows the design strategy chosen for G-CSF to be imme- 



diately applicable to the other four-helix bundle cytokines 
(human growth hormone, erythropoietin, the interleukins, 
and interferon-α/β, all clinically important compounds)
and thus broadens the potential impact of the results. 

Although the cytokines are functionally very efficacious, 
their pharmacological properties are not ideal. For example, 
G-CSF, like most proteins, is not absorbed orally to any 
significant extent and must be administered by frequent 
(daily) injections throughout the course of treatment. It also 
has limited stability and strict formulation and storage re- 
quirements, including the need to be kept refrigerated. Thus, 
there is significant interest in developing analogs with im- 
proved pharmacological properties. 

We sought to use PDA to improve the physicochemical 
characteristics of G-CSF. Improving these properties can 
make a molecule more robust, enhance its shelf life, or 
make it more amenable to use in alternate delivery systems 
and formulations. It can also affect clinically important fea- 
tures such as pharmacokinetics and result in a drug that is 
safer for human use. Our design strategy was to optimize the 
core to improve the stability and solution properties of 
G-CSF while preserving receptor binding and biological 
activity. 

The template structure used for in silico screening was a 
homology model of human G-CSF in which the human 
sequence was mapped onto bovine G-CSF. We designed 
several novel core sequences, cloned and expressed them, 
characterized their stabilities, tested them for functional ac- 
tivity both in vitro and in vivo, and studied their pharma- 
cokinetics in monkeys. The designed proteins showed en- 
hanced thermal stabilities, displayed five- to 10-fold im- 
provements in shelf life, and were biologically active both 
in cell proliferation assays and in a neutropenic mouse 
model. Subcutaneous injection of the most stable variant in 
monkeys also resulted in greater systemic exposure, prob- 
ably attributable to improved absorption from the subcuta- 
neous compartment. These results indicate that PDA has 
great potential as a powerful in silico tool in the design of 
improved pharmaceutical proteins. 

Results and Discussion 

Homology modeling 

The crystal structure of bovine G-CSF (PDB record 1bgc)
(Lovejoy et al. 1993) was used as the starting point for
modeling because the crystal structure of human G-CSF
(PDB record 1rhg) (Hill et al. 1993) is at a lower resolution
and is missing key fragments, including a structurally im- 
portant disulfide bond between positions 64 and 74. Bovine 
G-CSF is a good model for human G-CSF because the 
sequences are the same length and 142 of 174 amino acids 
are identical (82%). The residues that differ in the bovine 
sequence were replaced with the human residues for those 
positions, and the conformations of the replaced side chains
were optimized using PDA. Most of the replaced residues 
were solvent exposed, thereby introducing little strain into 
the structure and allowing typical PDA parameters to be 
used for conformation optimization. One substitution, how- 
ever, was at a buried site, G167V, and clashed sterically 
with a nearby disulfide bond. To accommodate the larger 
Val, the side-chain conformation at this position was opti- 
mized using a less restrictive van der Waals scale factor (0.6 
instead of 0.9). The entire structure was then briefly mini- 
mized to relax the strain. The final structure that served as 
the template for all the designs is shown in Figure 1. 
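
The identity figure quoted above (142 of 174 residues, about 82%) and the substitution step of the homology modeling can be illustrated with a short Python sketch. It computes percent identity for two aligned, equal-length sequences and lists the positions at which the template residue would be replaced. The function names and the ten-residue toy sequences are hypothetical; they are not the G-CSF sequences and this is not the authors' software.

```python
def percent_identity(template: str, target: str) -> float:
    """Percent of aligned positions that carry the same residue."""
    if len(template) != len(target):
        raise ValueError("sequences must be aligned and of equal length")
    matches = sum(a == b for a, b in zip(template, target))
    return 100.0 * matches / len(template)


def differing_positions(template: str, target: str):
    """Return (1-based position, template residue, target residue) for each mismatch."""
    return [
        (i, t, h)
        for i, (t, h) in enumerate(zip(template, target), start=1)
        if t != h
    ]


# Toy example (hypothetical sequences, one-letter codes).
bovine_like = "ACDEFGHIKL"
human_like = "ACDQFGHIRL"

print(percent_identity(bovine_like, human_like))    # 80.0
print(differing_positions(bovine_like, human_like))  # [(4, 'E', 'Q'), (9, 'K', 'R')]

# The figure reported in the text: 142 identical residues out of 174.
print(round(100 * 142 / 174))  # 82
```

In a real workflow the mismatch list would drive the side-chain replacement step, with buried substitutions such as G167V flagged for separate treatment.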

Core designs 

Unlike many experimental sequence screening methods, 
PDA allows control over which residues are allowed to 

Fig. 1. Template structure of hG-CSF used for Protein Design Automation
(PDA) designs. The human sequence was homology modeled onto the
bovine crystal structure (PDB record 1bgc). The residues that differ in the
bovine sequence or were not present in the bovine crystal structure were 
replaced with the residues from the human sequence. The conformations of 
the replaced side chains were optimized using PDA (the larger Val at 
position 167 was optimized using a less restrictive van der Waals scale 
factor), and the entire structure was energy minimized for 50 steps. 



change. Core residues were selected because optimization 
of these positions can improve stability yet minimize 
changes to the molecular surface, thus limiting the designed 
protein's potential for antigenicity. Ala scanning studies of 
G-CSF indicate one or two binding sites on the protein 
surface that are probably responsible for granulopoietic ac-
tivity (Reidhaar-Olson et al. 1996; Young et al. 1997) (Fig.
1). Although recent crystallographic studies of G-CSF com- 
plexed to its receptor show only one binding site in a novel 
2:2 complex (Horan et al. 1996; Aritomi et al. 1999), both 
sites were avoided in the core designs to ensure preservation 
of function. 

Two PDA design calculations were run: a deep core de- 
sign that included residues deeply buried in the interior of 
the protein and an expanded core design (exp_core) that 
also included less buried peripheral core residues. The deep 
core design had 26 core positions that were allowed to vary 
(shown yellow and gold in Fig. 2), whereas exp_core had 34 
(shown yellow and turquoise in Fig. 2). Only hydrophobic
amino acids were considered at the variable core positions.
These included Ala, Val, Ile, Leu, Phe, Tyr, and Trp. Gly
was also allowed for the variable positions that had Gly in
the bovine wild-type structure (positions 28, 149, 150, and
167). Met and Pro were not allowed.
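
To give a sense of the size of the search, the following Python sketch estimates the number of candidate sequences when each variable position may take any of the seven hydrophobic residues listed above. It is a simplified count made under stated assumptions: it ignores the extra Gly option at the four Gly positions and says nothing about rotamers, so it is a rough estimate rather than the exact figure screened. The function name is hypothetical.

```python
import math

# The seven residue types considered at variable core positions.
HYDROPHOBIC = ("Ala", "Val", "Ile", "Leu", "Phe", "Tyr", "Trp")


def log10_sequence_count(n_positions: int, n_choices: int = len(HYDROPHOBIC)) -> float:
    """log10 of n_choices ** n_positions, i.e. the order of magnitude of the sequence space."""
    return n_positions * math.log10(n_choices)


for n in (25, 26, 34):
    print(f"{n} positions: about 10^{log10_sequence_count(n):.1f} sequences")
```

With seven choices per position, 25 to 34 variable positions correspond to roughly 10^21 to 10^28 sequences, far beyond what experimental library screening can cover.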

Optimal sequences 

The optimal sequences selected by PDA are also shown in 
Figure 2. The optimal sequence from the deep core design 
had 10 mutations (named core10), and the optimal exp_core
sequence had 11 (named exp_core11); thus, 33%-38% of
the variable residues changed their identities. Eight of the 
mutated positions changed to the same amino acid in both 
designs. Changing the set of design positions can signifi- 
cantly impact the amino acid selected at a given position. 
For example, in the deep core design, Leu89 retains the 
same amino-acid identity and conformation as wild type. 
However, in the exp_core design, when Leu92 is also al- 
lowed to vary, both positions (Leu89 and Leu92) mutate to 
Phe, indicating a coupling between these two core residues. 
The modeled structure of the sequence selected in the deep 
core design (core10) is shown in Figure 3.

Native human G-CSF (met hG-CSF) and the optimal se-
quence from each of the core designs were cloned, ex-
pressed in Escherichia coli, and purified for experimental
studies. 

Thermal stability

The far-ultraviolet (UV) circular dichroism (CD) spectra for 
met hG-CSF and the designed proteins were nearly identical 
to each other and to published spectra for met hG-CSF 
(Reidhaar-Olson et al. 1996; Young et al. 1997), indicating 
highly similar secondary structure and tertiary folds (data 

Fig. 2. Sequences of hG-CSF analogs. Native human and bovine sequences are shown at the top. The fragments missing in the crystal
structure of the human sequence are shown boxed. Variable positions are colored. The deep core design had 26 variable positions,
exp_core had 34, and core167V had 25. The optimal sequence from each design is shown. Letters indicate core residues that mutated
relative to native hG-CSF; blanks indicate no change. Positions that changed to the same amino acid in all three core designs are
indicated in bold. Core2 and core8 sequences were not obtained from PDA calculations but were derived by reverting some of the
core10 mutations to wild type. Melting temperatures (T_m values) obtained for the designed proteins are also shown.



not shown). Thermal denaturation was monitored at 222 
nm, and the melting temperatures (T_m values) were derived from
the derivative curve of the ellipticity at 222 nm versus tem-
perature (Fig. 4). Thermal denaturation of G-CSF and its
variants is irreversible; however, T_m can be used to quickly
assess the relative stability of different mutants. Stability 
under storage conditions, which is more relevant clinically, 
was evaluated with shelf-life studies (see below). 

The T_m for met hG-CSF was 60°C, identical to that re-
ported in other studies (Kolvenbach et al. 1997). Core10
showed an increase in stability of 13°C, whereas the T_m of
exp_core11 was very similar to wild type (Fig. 2 and Fig. 4).
The increased stability seen with core10 may be attributable
to improved packing interactions and optimized hydropho- 
bic burial of side chains. Other possibilities include de- 
creased aggregation resulting from elimination of the free 



cysteine at position 17. The Gly to Ala mutation at position 
28 caused a significant improvement in helical propensity 
that could also be the source of the improved stability. 



Identifying critical mutations using derived sequences 

To differentiate between these possibilities, two additional 
sequences derived from the core10 mutant sequence were
made and their T_m values measured. One of these (core8) was
identical to core10 except that two mutations distant from
the others were reverted to wild type (L103V and V110I).
These were the two positions that did not mutate in
exp_core11. The T_m of core8 was 70°C, similar to core10,
indicating that the mutations at 103 and 110 were not re-
sponsible for core10's improved stability.

Fig. 3. Modeled structure of hG-CSF analog (core10) obtained from deep
core design. Twenty-six core residues were allowed to vary; computational
screening with PDA resulted in 10 mutations: C17L, G28A, L78F, Y85F,
L103V, V110I, F113L, V151I, V153I, and L168F.

To determine the importance of the other mutations, an- 
other sequence was made (core2) that contained only two of 
the core10 mutations, G28A and C17A; all other residues

Fig. 4. Thermal stability of hG-CSF analogs. Thermal stability was as-
sessed by monitoring the temperature dependence of the circular dichroism
spectral signal at 222 nm. Melting temperatures (T_m values) were derived from
the derivative curve of the ellipticity at 222 nm versus temperature. Core10
and core2 showed increases in T_m of 13°C and 5°C, respectively, over
native met hG-CSF.



were identical to wild type (Fig. 2). The T_m of core2 was
5°C higher than wild type, indicating that improvements in 
helical propensity and the elimination of a free cysteine are 
important for heightened thermostability. The remainder of 
the increase in T_m seen for core10 may be attributable to
improved packing interactions and increased hydrophobic 
burial. 



Storage stability 

Increased shelf life is important for distribution and storage 
and is a desirable feature for G-CSF and other protein drugs. 
Because aggregation and chemical degradation are the pre- 
dominant mechanisms of inactivation of G-CSF (Herman et 
al. 1996), shelf life was estimated by incubating the proteins
at elevated temperature and then using size-exclusion chro- 
matography to observe the disappearance of monomeric 
protein. Chemical degradation was estimated using reverse 
phase chromatography (data not shown). Core2 and core10
showed five- and 10-fold improvements in storage stability, 
respectively, at 50°C (Fig. 5). Rate constants were deter- 
mined by a first order exponential fit of the fraction mono- 
mer remaining/time curves using KaleidaGraph (Synergy 
Software). 

Biological activity 

Granulopoietic activity was determined in vitro by quanti- 
tating cell proliferation as a function of protein concentra- 
tion in murine lymphoid cells transfected with the gene for 
the human G-CSF receptor. The designed proteins were as 

Fig. 5. Shelf life of hG-CSF analogs. Shelf life was estimated by incubat-
ing the proteins at elevated temperature (50°C) and using size exclusion
chromatography to observe disappearance of monomeric protein. Rate con-
stants were determined by a first order exponential fit of the fraction
monomer remaining/time curves. Core2 and core10 showed five- and 10-
fold improvements in storage stability, respectively, over met hG-CSF
controls.

Fig. 6. In vivo granulopoietic activity of hG-CSF analogs. Mice were
rendered neutropenic with a single intraperitoneal injection of 200 mg/kg
cyclophosphamide (CPA). Beginning 24 h later and for 4 consecutive days,
the mice were given a daily intravenous injection of 100 µg/kg of native
hG-CSF (filgrastim, Amgen), an hG-CSF analog, or saline. On day 5,
granulopoietic activity was determined by counting the number of white
blood cells and polymorphonuclear neutrophils (PMN). The designed ana-
logs (core8 and core10) were as effective as controls in eliciting a granu-
lopoietic response.

active as wild-type hG-CSF (data not shown). The designed 
analogs were also as effective as wild type in increasing 
white blood cell and polymorphonuclear neutrophil levels in 
the neutropenic mouse (Fig. 6). Neutropenia, characterized 
by an abnormally low level of neutrophils in the blood, was 
induced by injection of cyclophosphamide. Reversal of this 
effect by the designed analogs shows that granulopoietic 
activity was also retained in vivo. 

Pharmacokinetics 

The pharmacokinetics of core10 and native hG-CSF (fil-
grastim, Amgen) was studied in cynomolgus monkeys after
a single subcutaneous or intravenous injection of 5 µg/kg
and after daily subcutaneous injections of 5 µg/kg for 28 d.
Analysis of the serum concentration-time curves shows that 
subcutaneous injection of the designed analog results in 
greater systemic exposure (area under concentration-time 
curve, AUC) than the same dose of wild-type hG-CSF (Fig. 
7B). This was true after a single dose on day 1 (78.8 vs. 54.6
ng·h/mL, data not shown), as well as on day 28 (37.2 vs.
17.4 ng·h/mL). There were no measurable differences in
serum half-life. In the intravenous study, however, the half-
life of core10 was three-fold shorter (1 vs. 3 h), and the
AUC was significantly less (54.7 vs. 117.4 ng·h/mL), indi-
cating that core10 is cleared faster (Fig. 7A). Taken to-
gether, these data indicate that the designed analog is ab-
sorbed more quickly from the subcutaneous compartment
(absorption could not be measured directly given the small 
number of data points at early times). Improved absorption 
may be attributable to decreased aggregation or association 
of the designed protein. The increased monomer lifetime 
and decreased aggregation seen in our shelf-life studies and 



the improved thermal stability of the native conformation 
observed for core10 indicate a decrease in aggregation in
the subcutaneous compartment. This possibility is sup- 
ported by the fact that other protein therapeutics engineered 
for reduced aggregation also show faster absorption rates. 
For example, insulin Lispro and other rapid-acting insulin 
analogs that were designed to decrease their tendency to 
self-associate are absorbed faster than regular insulin after 
subcutaneous injection (Howey et al. 1994; Home et al. 
1999). 

Fig. 7. Pharmacokinetics of hG-CSF analogs. Plasma concentrations of a
designed hG-CSF analog or wild-type hG-CSF (filgrastim, Amgen) were
determined after administration in cynomolgus monkeys. (A) Animals were
given a single intravenous injection of 5 µg/kg or (B) daily subcutaneous
injections of 5 µg/kg for 28 d. Noncompartmental analysis of the serum
concentration-time curves shows that subcutaneous injections of the core10
analog resulted in greater systemic exposure (area under concentration-
time curve, AUC) than the same dose of wild-type hG-CSF, whereas there
was no change in serum half-life (t_1/2). In the intravenous study, the AUC
was significantly less and the t_1/2 three-fold shorter, indicating that core10
was cleared faster.
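
The systemic exposure (AUC) discussed above is commonly estimated in noncompartmental analysis with the linear trapezoidal rule. The Python sketch below shows that calculation on a made-up concentration-time profile and then forms the exposure ratios from the AUC values reported in the text. The sampling times and concentrations are hypothetical, the function name is an assumption, and the source does not state which integration rule was actually used.

```python
def auc_trapezoid(times_h, conc_ng_per_ml):
    """Area under a concentration-time curve (ng*h/mL) by the linear trapezoidal rule."""
    if len(times_h) != len(conc_ng_per_ml) or len(times_h) < 2:
        raise ValueError("need two or more matched time/concentration points")
    points = list(zip(times_h, conc_ng_per_ml))
    return sum(
        0.5 * (c0 + c1) * (t1 - t0)
        for (t0, c0), (t1, c1) in zip(points, points[1:])
    )


# Hypothetical serum profile after a single injection.
times = [0, 1, 2, 4, 8]       # hours
conc = [0, 10, 8, 4, 0]       # ng/mL
print(auc_trapezoid(times, conc))  # 34.0

# Exposure ratios (core10 / wild type) from the subcutaneous AUCs quoted in the text.
print(round(78.8 / 54.6, 2))  # day 1:  1.44
print(round(37.2 / 17.4, 2))  # day 28: 2.14
```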

Comparison to published G-CSF variants

In vitro and cassette mutagenesis studies have shown that 
alterations of the N-terminal region of G-CSF can lead to 
improved granulopoietic activity (Kuga et al. 1989; Okabe
et al. 1990). Point mutations at Cys17 have also been found
to affect shelf life; replacement with Ala led to an increase,
Ser had no effect, and large residues (Ile, Tyr, Arg) led to a
decrease (Ishikawa et al. 1992). In contrast, our core10 se-
quence, which has a large residue (Leu) at this position,
showed an improved shelf life. This may be explained by
the observation that in a Cys17Leu point mutant, Leu's side
chain would clash with the aromatic ring of the nearby Phe
at position 113. This steric clash does not occur in core10,
however, because the Phe at 113 is replaced by Leu and, in
compensation for this change, two nearby Leu's become
Phe's (at positions 78 and 168). Thus, multiple mutations
allow complementary repacking of the hydrophobic core in
the core10 mutant and may be responsible for its enhanced
stability and shelf life. 

Significant improvements in thermal stability were also 
observed when the seven helical Gly residues in G-CSF 
were replaced with Ala to form point, double, and triple 
mutants (Bishop et al. 2001). Substitutions at positions 26, 
28, 149, and 150 were the most effective. The investigators 
attributed the stabilizing effect to the enhancement in α-he-
lical propensity associated with the Gly/Ala substitutions.
These data support our suggestion that the heightened ther- 
mal stability seen with our mutants (which also contain a 
Gly/Ala substitution at position 28) is at least in part attrib- 
utable to an improvement in helical propensity. 

Probing the robustness of PDA with 
a homology modeled core position 

As pointed out previously, the homology modeling of hu- 
man G-CSF onto the bovine structure was straightforward 
for the most part because the replaced residues were prima- 
rily solvent exposed and no rearrangement of the backbone 
was necessary. The change at one core position, however, 
G167V, induced a steric clash and energy minimization of 
the entire protein was used to relieve the strain. We decided 
to assess the impact of this manipulation by doing an addi-
tional design (core167V) in which the variable residues
were essentially the same as in the deep core design except
that position 167 was also allowed to vary. We found that
Val167 mutated to Ala (the other mutations were essentially
the same as for core10). To probe the plasticity of the core,
instead of using this PDA optimal sequence, which only had
two mutations in this region, we ran experiments on another
high-scoring sequence (core14_V167A) that had additional
mutations (14 total, including L157I, F160W, and L161F).
This sequence was chosen because it balanced an extensive 
number of mutations with a relatively high design score. 



Although it ranked 21st in the sequence energy list and was 
2 kcal/mole less favorable than the optimal sequence, it was 
still biologically active and as stable as wild type (T_m of
61°C) (Figs. 2, 4). This indicates that optimization with
PDA is fairly robust, and that the protein core can be quite 
plastic and can accommodate large changes without sacri- 
ficing stability or function. 

Conclusions 

PDA is a powerful ultrahigh throughput computational 
screening method. Its ability to screen up to 10^80 sequences
and allow multiple simultaneous mutations significantly in- 
creases the likelihood of finding new and improved pro- 
teins. In this study, PDA was used to develop improved 
analogs for a therapeutically important protein, hG-CSF. 
The novel proteins showed enhanced thermal stabilities and 
shelf life while retaining biological activity. Analysis of the 
mutants and results obtained with derived sequences indi- 
cates that the heightened stability is attributable to improve- 
ments in helical propensity and the elimination of a free 
cysteine; improved core packing and optimized hydropho- 
bic burial of side chains may also be important. Pharmaco- 
kinetic studies indicate that subcutaneous injection of the 
most stable variant results in greater systemic exposure, 
probably attributable to improved absorption from the sub- 
cutaneous compartment. 

These results show that PDA can be successfully applied 
to proteins of therapeutic interest. They also illustrate the 
value of its precise control over the site and type of muta- 
tions, allowing for the rational design of desired properties 
such as improved stability and pharmacokinetics and the 
elimination of undesirable ones such as toxicity and antige- 
nicity. These features are particularly important in the de- 
sign of therapeutic proteins. PDA thus has great potential as 
a powerful in silico tool for therapeutic protein design. 

Materials and methods 

Template structure preparation 

The template structure for the designed proteins was produced by 
homology modeling using the crystal structure of bovine G-CSF 
(Brookhaven Protein Data Bank code 1bgc) as the starting point.
The program BIOGRAF (Molecular Simulations Inc., San Diego,
CA) was used to generate explicit hydrogens on the structure,
which was then minimized for 50 steps using the conjugate gra-
dient method and the Dreiding II force field (Mayo et al. 1990).
The residues that differ in the bovine sequence or were not present 
in the bovine crystal structure were replaced with the human resi- 
dues for those positions. The conformations of the replaced side 
chains were optimized using PDA (Dahiyat and Mayo 1997a,b), 
and the entire structure was minimized again for 50 steps. This 
minimized structure was used as the template for all the designs. 

Protein design

Analogs of hG-CSF were designed by simultaneously optimizing 
residues in the buried core of the protein using PDA. The compu- 
tational details, residue classification, potential functions, and pa- 
rameters used for van der Waals interactions, solvation, and hy- 
drogen bonding are described in previous work (Dahiyat and Mayo 
1996, 1997a). An expanded version of the backbone-dependent 
rotamer library of Dunbrack and Karplus (Dunbrack and Karplus 
1993) was used in all the calculations. The global optimum se- 
quence from each design was selected for characterization and 
experimental testing, except for core167V in which the 21st ranked
sequence was used. Calculations were generally performed over-
night using 16 processors of an SGI Origin 2000 with 32 R10000
processors running at 195 MHz. The length of the runs varied from 
1 to several hours of CPU time. 



Cloning and expression 

A gene for met hG-CSF was synthesized from partially overlap-
ping oligonucleotides (~100 bases) that were extended and PCR
amplified. Codon usage was optimized for E. coli and several 
restriction sites were incorporated to ease future cloning. These 
partial genes were cloned into a vector and transformed into E. coli 
for sequencing. Several of these gene fragments were then cloned 
into adjacent positions in an expression vector (pET17 or pET21) 
to form the full-length gene for met hG-CSF (528 bases) and 
transformed into E. coli for expression. Protein was expressed in E.
coli in insoluble inclusion bodies and its identity was confirmed by 
immunoblot of SDS-PAGE using a commercial mAb against 
hG-CSF. 



Refolding, purification, and storage 

The protein inclusion bodies were solubilized in detergent and 
refolded in the presence of CuSO4 to promote formation of native
disulfide bonds (Lu et al. 1992). A size-exclusion column (10
mm × 300 mm loaded with Superdex prep 75 resin purchased from
Pharmacia) was loaded with protein and eluted at a flow rate of 0.8
mL/min using the column buffer (100 mM Na2SO4, 50 mM Tris,
pH 7.5). The peaks were monitored at dual wavelengths of 214 nm
and 280 nm. Albumin, carbonic anhydrase, cytochrome C, and
aprotinin were used to calibrate the molecular size of proteins 
versus elution time. The monomeric peak that elutes around the 
expected elution time for each protein was collected and the buffer 
was exchanged into 10 mM NaOAc at pH 4 for biophysical char- 
acterization. For long-term storage, a buffer of 5% sorbitol, 
0.004% Tween 80, and 10 mM NaOAc at pH 4 was used. A pH of 
4 was chosen for these buffers to be consistent with the commer- 
cial formulation of hG-CSF (Amgen), which was used as a control. 
The proteins were >98% pure as judged by reversed phase high 
performance liquid chromatography (HPLC) on a C4 column (3.9 
mm X 150 mm) with a linear acetonitrile-water gradient containing 
0.1% TFE. The identities of all proteins were confirmed by com- 
paring the molecular mass measured by mass spectrometry with 
corresponding molecular mass calculated using the protein se- 
quences. 

Spectroscopic characterization

Protein samples were 50 µM in 50 mM sodium phosphate at pH 5.5.
Concentrations were determined using UV spectrophotometry. Protein
structure was assessed by CD. CD spectra were measured on an Aviv
202DS spectrometer equipped with a Peltier temperature control unit
using a 1-mm path length cell. Thermal stability was assessed by
monitoring the temperature dependence of the CD signal at 222 nm
(Kolvenbach et al. 1997). A buffer of 10 mM NaOAc was used at pH 4.0
and data were collected every 2.5°C with an averaging time of 5 sec
and an equilibration time of 3 min. Thermal denaturation curves were
smoothed using KaleidaGraph. The melting temperature (Tm) of each
protein was derived from the derivative curve of the ellipticity at
222 nm versus temperature. The Tm values were reproducible to within
2°C for the same protein at the concentrations used.
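
The derivative-based estimate of the melting temperature described above can be illustrated with a short sketch. It is not the authors' KaleidaGraph procedure: the function name, the central-difference derivative, and the synthetic two-state curve are all assumptions made for illustration, and the smoothing step is omitted.

```python
import math

def melting_temperature(temps, signal):
    """Return the temperature where |d(signal)/dT| is largest.

    Central finite differences are used on interior points, so `temps`
    must be strictly increasing and contain at least three values.
    """
    if len(temps) != len(signal) or len(temps) < 3:
        raise ValueError("need at least three matching data points")
    best_temp, best_slope = None, -1.0
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if abs(slope) > best_slope:
            best_temp, best_slope = temps[i], abs(slope)
    return best_temp

# Synthetic two-state unfolding curve (illustrative values only),
# sampled every 2.5 degrees C as in the protocol above.
temps = [20.0 + 2.5 * i for i in range(29)]          # 20 to 90 degrees C
signal = [-20.0 / (1.0 + math.exp((t - 60.0) / 3.0)) for t in temps]
tm = melting_temperature(temps, signal)
print(f"estimated Tm = {tm:.1f} C")
```

Because the estimate is restricted to sampled temperatures, its resolution is limited by the 2.5°C step, which is in line with the reproducibility of about 2°C quoted in the text.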



Storage stability 

The storage stability of the designed proteins was assessed by
incubation at both 37°C and 50°C under solution conditions identical
to those used in the commercial formulation of hG-CSF (filgrastim,
Amgen). Because aggregation and chemical degradation are the
predominant mechanisms of inactivation of G-CSF (Herman et al. 1996),
accelerated degradation was followed by observing the disappearance
of monomeric protein with both size-exclusion and reverse-phase
chromatography. Rate constants for shelf-life estimation were
determined by a first-order exponential fit of the fraction of
monomer remaining versus time curves using KaleidaGraph (Synergy
Software).
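
As a rough illustration of the first-order fit, the sketch below estimates a rate constant k from fraction-of-monomer data, assuming the model f(t) = exp(-k t). It swaps the nonlinear fit of the text for a log-linear least-squares fit through the origin, and the time points, the value of k, and the function name are invented for the example.

```python
import math

def first_order_rate_constant(times, fractions):
    """Least-squares estimate of k in f(t) = exp(-k * t).

    Fits ln(f) against t with the line forced through ln(1) = 0 at t = 0.
    All fractions must be greater than zero.
    """
    if any(f <= 0 for f in fractions):
        raise ValueError("fractions must be positive")
    num = sum(t * math.log(f) for t, f in zip(times, fractions))
    den = sum(t * t for t in times)
    return -num / den

# Invented data: days of incubation and fraction of monomer remaining.
days = [0, 2, 4, 7, 10, 14]
k_true = 0.05                                   # per day, illustrative only
frac = [math.exp(-k_true * t) for t in days]

k = first_order_rate_constant(days, frac)
half_life = math.log(2) / k                     # time for half the monomer to disappear
print(f"k = {k:.4f} per day, half-life = {half_life:.1f} days")
```

A log-linear fit weights late, low-fraction points more heavily than a direct exponential fit does, so the two approaches can differ on noisy data.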

Cell proliferation assay 

Granulopoietic activity was measured by quantifying cell
proliferation as a function of protein concentration using Ba/F3
(murine lymphoid) cells stably transfected with the gene encoding the
human Class 1 G-CSF receptor (Avalos et al. 1995). Cell proliferation
was detected by 5-bromo-2'-deoxyuridine (BrdU) incorporation
quantified by a BrdU-specific ELISA kit (Boehringer Mannheim).



In vivo biological activity 

Granulopoietic activity was determined in the neutropenic mouse
(Hattori et al. 1990). C57BL/6 mice were rendered neutropenic with a
single intraperitoneal injection of 200 mg/kg cyclophosphamide (CPA).
Beginning 24 h later and for 4 consecutive days, the mice were given
a daily intravenous injection of 100 µg/kg of an hG-CSF analog, met
hG-CSF produced in our laboratory, clinically available hG-CSF
(filgrastim, Amgen), or saline. On day 5, 6 h after the final dose,
the animals were killed, blood samples were collected, and
granulopoietic activity was determined by counting the number of
white blood cells and polymorphonuclear neutrophils.



Pharmacokinetics 

Plasma concentrations of a designed hG-CSF analog or wild-type
hG-CSF (filgrastim, Amgen) were determined following administration
in cynomolgus monkeys. Animals were given a single intravenous
injection of 5 µg/kg or daily subcutaneous injections of 5 µg/kg for
28 d. In the intravenous study, blood samples were collected at 0
(predose), 5, 15, and 30 min and 1, 2, 4, 6, 8, 12, and 24 h
postdosing. In the subcutaneous studies, blood samples were collected
at 0 (predose), 1, 2, 4, 6, 8, 12, and 24 h postdosing on
day 1 and day 28. All samples were immediately placed on wet ice
and centrifuged at 2-8°C. The resultant plasma was then frozen and
stored (-70°C). Plasma concentrations were determined using an
enzyme-linked immunosorbent assay (Quantikine human G-CSF ELISA, R&D
Systems, Minneapolis, MN), performed per manufacturer's instructions
except that samples were diluted in PBS, 5% nonfat dry milk, and
0.05% Tween 20, and the incubation was extended to overnight at 4°C.
Plasma concentrations of the designed hG-CSF analog and filgrastim
were estimated from their corresponding standard curves.
Pharmacokinetic parameters were calculated by noncompartmental
analysis. The terminal slope (λz) was estimated by linear regression
through the last time points of the log concentration versus time
curves and used to calculate the terminal half-life (t1/2). The area
under the curve from time of dosing through the last time point
(AUC0-t) was calculated by the linear trapezoid method.
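
The noncompartmental quantities named above can be sketched in a few lines. This is an illustration under stated assumptions rather than the analysis actually performed: the concentrations are invented, the number of terminal points used for the regression (`n_last=3`) is an arbitrary choice, natural logarithms are used so that the slope equals -λz, and the area is taken from the first sampled time rather than from the instant of dosing.

```python
import math

def auc_linear_trapezoid(times, conc):
    """Area under the concentration-time curve by the linear trapezoid rule."""
    return sum((t2 - t1) * (c1 + c2) / 2.0
               for t1, t2, c1, c2 in zip(times, times[1:], conc, conc[1:]))

def terminal_slope(times, conc, n_last=3):
    """Return lambda_z from a least-squares line through ln(C) versus t.

    Only the final `n_last` points are used; concentrations must be > 0.
    """
    ts = times[-n_last:]
    ys = [math.log(c) for c in conc[-n_last:]]
    t_mean = sum(ts) / n_last
    y_mean = sum(ys) / n_last
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    sxx = sum((t - t_mean) ** 2 for t in ts)
    return -sxy / sxx

# Intravenous sampling times from the protocol, in hours after dosing.
times_h = [5 / 60, 15 / 60, 30 / 60, 1, 2, 4, 6, 8, 12, 24]
# Invented mono-exponential concentrations, for illustration only.
conc = [100.0 * math.exp(-0.3 * t) for t in times_h]

lambda_z = terminal_slope(times_h, conc)
half_life = math.log(2) / lambda_z              # t1/2 = ln 2 / lambda_z
auc_0_t = auc_linear_trapezoid(times_h, conc)   # first to last sampled time
print(f"lambda_z = {lambda_z:.3f} 1/h, t1/2 = {half_life:.2f} h")
```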
Inactivation of TNF Signaling by
Rationally Designed Dominant-Negative TNF Variants

Tumor necrosis factor (TNF) is a key regulator of inflammatory responses
and has been implicated in many pathological conditions. We used
structure-based design to engineer variant TNF proteins that rapidly form
heterotrimers with native TNF to give complexes that neither bind to nor
stimulate signaling through TNF receptors. Thus, TNF is inactivated by
sequestration. Dominant-negative TNFs represent a possible approach to
anti-inflammatory biotherapeutics, and experiments in animal models show
that the strategy can attenuate TNF-mediated pathology. Similar rational
design could be used to engineer inhibitors of additional TNF superfamily
cytokines as well as other multimeric ligands.



TNF is a proinflammatory cytokine that can complex two TNF receptors,
TNFR1 (p55) and TNFR2 (p75), to activate signaling cascades
controlling apoptosis, inflammation, cell proliferation, and the
immune response (1-5). The 26-kD type II transmembrane TNF precursor
protein, expressed on many cell types, is proteolytically converted
into a soluble 52-kD homotrimer (6). An elevated serum level of TNF
is associated with the pathophysiology of rheumatoid arthritis (RA),
inflammatory bowel disease, and ankylosing spondylitis (1, 7, 8), and
molecules that inhibit TNF signaling have demonstrated clinical
efficacy in treating some of these diseases (9, 10).

We have engineered dominant-negative TNF (DN-TNF) variants that
inactivate the native homotrimer by a sequestration mechanism that
blocks TNF bioactivity (fig. S1). Protein design automation (PDA), an
in silico method that predicts protein variants with improved
biological properties (11-13), was used to introduce single or double
amino acid changes into TNF (Fig. 1A) to generate the desired
biological profile while maintaining the overall structural integrity
of the molecule. Specifically, our goal was to design homotrimeric
TNF variants that (i) have decreased receptor binding, (ii) sequester
native TNF homotrimers from TNF receptors by formation of inactive
native:variant heterotrimers, (iii) abolish TNF signaling in relevant
biological assays, and (iv) are easily expressed and purified in
large quantities from bacteria. Variants were tested for TNF receptor
activation in cell-based assays, and nonagonistic variants were then
checked for their ability to antagonize native TNF in cell and animal
models. Subsequently, we evaluated assembly state, receptor binding,
and heterotrimer formation for several variants.

The computational design strategy used crystal structures of native
and variant TNF trimers as templates for the simulations. Analysis of
a homology model of the TNF-receptor complex revealed several
distinct regions of the cytokine that make multiple direct contacts
with its receptors (Fig. 1A), including interfaces rich in
hydrophobic and electrostatic interactions. We ran simulations to
select nonimmunogenic point mutations that would disrupt receptor
interactions while preserving the structural integrity of the TNF
variants and their ability to assemble into heterotrimers with native
TNF (14). Many of the designed TNF variants displayed markedly
reduced binding to TNFR1 and TNFR2, and several combinations of
potent single mutations further decreased binding (Fig. 1B and fig.
S2). As predicted by analysis of the TNF-TNFR structural complex,
combinations of the most potent single mutations at different
interaction domains (e.g., A145R and I97T) were frequently additive
or synergistic. Moreover, our data extend results of previous studies
(15, 16) in demonstrating that certain substitutions can alter the
specificity of receptor interactions; for example, I97T and
A145R/I97T show greater relative binding to TNFR2 than to TNFR1 (Fig.
1B and fig. S2).

Homotrimers of several designed TNF variants exhibited >10,000-fold
reduction in their ability to activate two major signaling pathways
downstream of TNF receptor activation. Specifically, single variants
such as Y87H and A145R and the double variant A145R/Y87H were unable
to bind to either TNFR1 or TNFR2 in cell-free assays (Fig. 1B and
fig. S2) and failed to activate either caspase (Fig. 1C) or nuclear
factor κB (NF-κB)-mediated luciferase expression (Fig. 1D) relative
to native TNF. In contrast, those variants that still bound TNFRs
also activated TNF signaling; in some cases (e.g., F144N) more
potently than native TNF. Disruption of two receptor interfaces
(e.g., A145R/Y87H or A145R/I97T) effectively destroyed the residual
agonism detected with some single-point TNF variants. Furthermore,
the importance of using multiple screening criteria to evaluate
DN-TNF bioactivity was revealed by variants such as A145R/I97T, which
had virtually no TNF-like activity in either cell-based assay yet
displayed appreciable TNFR2 binding affinity.

We subsequently tested nonagonistic TNF variants for their ability to
act as dominant-negative inhibitors by measuring their capacity to
block native TNF activity in cell-based assays. We evaluated
dose-dependent TNF antagonism (14) by mixing increasing
concentrations of variants with native TNF (5 ng/ml) for 1.5 hours
and measuring the caspase activity induced by these mixtures after
addition to U937 cells (Fig. 2A). At concentrations as low as twofold
that of native TNF (10 ng/ml), A145R/Y87H and A145R/I97T attenuated
TNF-induced caspase activity by 50%, and at 20-fold excess, activity
was reduced to baseline. The in vitro potency (by mass) of these
variants is comparable to that of a soluble Fc-TNFR2 fusion
(etanercept) and greater than that of an antibody to TNF
(infliximab), two marketed anti-TNF therapies, supporting the
potential utility of this mechanism. Similarly, at 20-fold excess
over native TNF, single-point (A145R, I97T, Y87H) and particularly
double-point (A145R/Y87H, A145R/I97T) variants decreased caspase
activation (fig. S3) as well as TNF-induced transcriptional
activation by NF-κB in human embryonic kidney (HEK) 293T cells (Fig.
2B). Consistent with these results, the TNF variant A145R/Y87H (at
10-fold excess over native TNF) blocked TNF-induced nuclear
translocation of the NF-κB p65-RelA subunit in HeLa cells (Fig. 2C).
Thus, a number of variants neutralized TNF-induced caspase and
NF-κB-mediated transcriptional activity over a wide range of native
TNF concentrations, including the clinically relevant range of 100 to
200 pg/ml found in the synovial fluid of RA patients (17-19).

Fig. 1. DN-TNF variants have impaired TNF receptor binding and
signaling. (A) Structural schematic of human TNF trimer-TNFR1 complex
with major contacts between ligand and receptor highlighted by solid
surfaces (green). Locations of representative mutated residues
substituted in dominant-negative variants are shown in boxes. (B)
Increasing concentrations of DN-TNF homotrimers were incubated with a
fixed concentration of either TNFR1 (black bar) or TNFR2 (white bar),
and the binding affinity (Kd) was measured. The histogram illustrates
the effect of mutations on binding affinity between DN-TNF variant
homotrimers and TNF receptors. (C and D) To measure TNF-induced
signaling, we incubated increasing concentrations of native TNF (•)
or the variants F144N (O), I97T (□), Y87H (■), A145R (a), A145R/Y87H
(*), and A145R/I97T (O) with either U937 cells to measure TNF-induced
caspase activity (C) or HEK 293T cells transfected with an
NF-κB-luciferase reporter plasmid to measure TNF-induced
transcriptional activation (D). DN-TNF variants, especially the
double mutants, have reduced TNF receptor binding and signaling
activity. RLU, relative luciferase units; Caspase activity, arbitrary
units normalized to [...].

Fig. 2. DN-TNF variants inhibit TNF-mediated intracellular signaling.
(A) Inhibition of caspase activation. Native TNF (5 ng/ml) was mixed
with buffer (#) or with increasing concentrations of either TNF
variants A145R (^), A145R/Y87H (*), or A145R/I97T (O), the soluble
Fc-TNFR2 fusion etanercept, or the TNF monoclonal antibody infliximab
(▼). After 1.5 hours of incubation in exchange buffer (14), these
mixtures were applied to U937 cells to stimulate caspase activity.
(B) Inhibition of NF-κB pathway activation. Native TNF (•) (25 µg/ml)
was mixed with 20-fold excess (by mass) of A145R (&), A145R/Y87H (*),
A145R/I97T (O), etanercept ('), or infliximab (▼). These mixtures
were serially diluted and applied to HEK 293T cells for 12 hours to
induce NF-κB-luciferase reporter activity. (C) A145R/Y87H inhibits
native TNF-induced nuclear translocation of the p65-RelA subunit of
NF-κB. Immunofluorescence studies show subcellular localization of
NF-κB in HeLa cells treated with buffer (panel 1), native TNF (10
ng/ml) (panel 2), A145R/Y87H (100 ng/ml) (panel 3), or the
combination of native TNF and variant A145R/Y87H (panel 4). RLU,
relative luciferase units; Caspase activity, arbitrary units
normalized to [...]. Scale bar, 25 µm.

Fig. 3. DN-TNF variants inhibit signaling by sequestering native TNF
into inactive heterotrimers. (A) Inverse correlation between
heterotrimer formation and caspase activity. Native TNF was mixed in
exchange buffer (14) with A145R/Y87H (v), A145R/I97T (□, ■), or
etanercept (O), as described for Fig. 2. A part of each mixture was
analyzed by a sandwich ELISA to detect native:variant TNF (open
symbols), and the remainder was used to stimulate caspase activity in
U937 cells (closed symbols). Caspase activity, arbitrary units
normalized to V [...]. (B) Native gel analysis of heterotrimer
formation with various ratios of native (N) and DN-TNF (V).
FLAG-tagged native TNF was incubated alone (N3:V0, lane 10:0) or with
increasing concentrations of His-tagged variant A145R/Y87H (lanes
10:1 to 10:100) before native gel electrophoresis to determine
heterotrimer formation. The differences in isoelectric point
conferred by the epitope tags allowed for resolution of all possible
trimer species (N3:V0, N2:V1, N1:V2, and N0:V3). Increasing
concentrations of DN-TNF variant caused the redistribution of native
TNF into both heterotrimers, and at 10-fold excess all detectable
native TNF was consumed.

Fig. 4. DN-TNF variants exhibit efficacy in vivo. (A) Effect of
heterotrimers of various ratios in the galactosamine-sensitized mouse
model of human TNF-induced endotoxemia. Native human TNF was dosed at
30 µg/kg and A145R/Y87H was dosed at the indicated ratios to a fixed
native human TNF dose of 30 µg/kg except at the 1:50* ratio, where
A145R/Y87H was dosed in 50-fold excess of native human TNF (75
µg/kg). Livers were harvested and samples were blinded and scored for
apoptotic damage on a scale of 0 to 4 as described (14). *P < 0.05.
(B) Efficacy of A145R/Y87H in the rat 7-day established CIA model.
A145R/Y87H was modified to introduce a PEG moiety at residue 31 of a
non-epitope-tagged molecule as described (14). One group of four
animals was nonarthritic (♦); the remaining animals were collagen
treated and, after the onset of symptoms, they were randomized into
groups of eight. Animals were treated with vehicle (o), variant at 10
mg/kg twice daily dosing (□), or variant at 2 mg/kg subcutaneously
with an intravenous loading dose of 2 mg/kg on the first treatment
day (■). Measurements of ankle diameter were made daily by caliper.
*P < 0.05.

To demonstrate that the mechanism of TNF inhibition requires the
formation of heterotrimeric complexes with native TNF, we measured
the relation between heterotrimer levels and inhibition of
TNF-induced signaling (14). We generated heterotrimeric complexes by
mixing a fixed amount of FLAG-tagged native TNF with increasing
concentrations of His-tagged TNF variants. A part of this material
was used in a sandwich enzyme-linked immunosorbent assay (ELISA)
(Fig. 3A, open symbols) to detect the formation of His-FLAG
heterotrimers, and the remainder was applied to U937 cells to detect
TNF-mediated caspase activation (Fig. 3A, closed symbols). The extent
of heterotrimer formation of A145R/Y87H or A145R/I97T with native TNF
correlated with a decrease in caspase activation, demonstrating an
inverse relation between signaling and heterotrimer formation. As
expected, etanercept activity is independent of TNF monomer exchange
(Fig. 3A, open circles) because etanercept binds to the TNF trimer.
To directly visualize heterotrimer formation, we mixed FLAG-tagged
native TNF with His-tagged DN-TNF and resolved the exchanged products
using native polyacrylamide gel electrophoresis (PAGE) (Fig. 3B)
(20). Electrophoresis of equimolar quantities of mixed DN-TNF and
native TNF resolved the variant homotrimer, 1:2 and 2:1
native:variant heterotrimers, and native homotrimer in approximately
the expected 1:3:3:1 ratio (Fig. 3B, lane 10:10). Western blot
analyses (14) with antibodies against the epitope tags confirmed the
composition of the intermediate species (fig. S4). Stochastic
equilibrium modeling of native and variant TNF heterotrimer assembly
predicts that 10-fold excess of variant homotrimer causes the loss of
more than 99% of homotrimeric native TNF, primarily into 1:2
native:variant heterotrimers, and our results confirmed this (Fig.
3B, lane 10:100). Exchange reactions between native and variant TNF
reached ~80% completion at 20 min, and essentially all the native
homotrimer was depleted after 90 min (fig. S5). Finally, we confirmed
that biological activity of variants requires exchange into
heterotrimeric complexes with native TNF. Specifically, our most
potent variants (e.g., A145R/Y87H) failed to block caspase activity
induced by chemically cross-linked native TNF homotrimers (14), which
are unable to dissociate to allow exchange with variant TNFs (fig.
S6).
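
The expected trimer ratios quoted in this paragraph follow from simple counting if subunits are assumed to mix at random. The sketch below uses a plain binomial model, which is an assumption made for illustration and not necessarily the stochastic equilibrium model the authors ran; the function names are hypothetical.

```python
from math import comb

def trimer_distribution(native, variant):
    """Fraction of trimers with k native subunits (k = 3, 2, 1, 0).

    Assumes subunits are drawn at random from a well-mixed pool, so the
    counts follow a binomial distribution with n = 3.
    """
    p = native / (native + variant)
    return {k: comb(3, k) * p**k * (1 - p)**(3 - k) for k in (3, 2, 1, 0)}

def native_share(dist, k):
    """Share of all native subunits found in trimers with k native subunits."""
    total_native = sum(n * frac for n, frac in dist.items())
    return k * dist[k] / total_native

equimolar = trimer_distribution(10, 10)     # lane 10:10
excess = trimer_distribution(10, 100)       # lane 10:100, 10-fold variant excess

# Ratios relative to the native homotrimer at equimolar mixing.
print({k: round(v / equimolar[3], 2) for k, v in equimolar.items()})
print(f"native left as homotrimer at 10-fold excess: {native_share(excess, 3):.2%}")
print(f"native in 1:2 native:variant trimers: {native_share(excess, 1):.2%}")
```

Under this model the equimolar mixture gives the 1:3:3:1 pattern, and at 10-fold variant excess less than 1% of native subunits remain in all-native trimers, with most of the native protein in trimers carrying one native and two variant subunits.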

The most potent in vitro inhibitors were selected for testing in
vivo, to further study the mechanism and to begin therapeutic lead
candidate identification. We tested the bioactivity of variant
homotrimer and native:variant heterotrimers in the D-galactosamine
(GalN)-sensitized mouse model, which demonstrated that DN-TNF
homotrimers, and heterotrimers with native TNF, are devoid of agonist
activity and efficiently exchange with endogenous TNF in vivo. GalN
is a known specific hepatotoxin that can increase the sensitivity of
mice to human TNF by 1000-fold (21, 22). Native human TNF (30 µg/kg)
induced severe hepatocellular apoptosis and lethality, consistent
with previous reports (23); in contrast, A145R/Y87H dosed as high as
30 mg per kilogram of mouse body weight (along with native TNF at 30
µg/kg) resulted in no mortality or hepatotoxicity (Fig. 4A) (24).
Similarly, lethal doses of native human TNF (30 µg/kg) mixed before
injection with varying ratios of A145R/Y87H produced no TNF-induced
damage. This protection was observed at native:variant ratios as low
as 1:1 and with a superlethal dose of TNF (Fig. 4A). Further,
sandwich ELISA analyses of serum samples indicated that a substantial
portion (30 ng/ml) of administered A145R (3 mg/kg) was in
heterotrimers with the endogenous mouse TNF at 1 hour.

A145R/Y87H was next assessed in a model of chronic disease as an
initial test of the DN-TNF antagonism mechanism in a disease-relevant
setting. We selected the rat 7-day established collagen-induced
arthritis (CIA) model because it simulates chronic autoimmune joint
disease and can be treated by TNF blockade (25). When dosed after the
onset of symptoms, only interventions with rapid onset of action
would be able to affect disease progression in this model, thus
requiring rapid exchange in vivo of TNF variants with endogenous TNF.
To ensure that there were no confounding in vivo effects of using
affinity-tagged variants, we produced A145R/Y87H that lacked such
tags. Further, to decrease in vivo clearance, we added one
polyethylene glycol (PEG; ~5 kD/molecule) to each monomeric subunit
of A145R/Y87H. This modification had no effect on the
dominant-negative properties of the molecule in vitro (fig. S7).
A145R/Y87H reduced joint swelling in the CIA model when dosed once
daily at 2.0 mg/kg subcutaneously with a loading dose of 2.0 mg/kg
and twice daily at 10 mg/kg intravenously (Fig. 4B). These results
demonstrate the potential of DN-TNFs to inhibit TNF-mediated
inflammation and verify that exchange occurs rapidly enough to affect
progression of acute symptoms when dosed therapeutically.

Given their high-yield bacterial production, theoretical low
immunogenicity, and unique mechanism of action, DN-TNFs show
potential as a new class of anti-inflammatory therapy, particularly
because existing methodologies (e.g., PEG modification) can be used
to further enhance their pharmacokinetic properties (26, 27).
Further, we propose that this dominant-negative approach should be
tested for its potential to create inhibitors of other multimeric
extracellular signaling molecules, in particular other members of the
TNF superfamily (e.g., RANKL, CD40L, and BAFF) that have been
implicated in human pathophysiology (28, 29).
the gene pool within breeds. Many of the ~360 known genetic disorders
in dogs resemble human conditions, and their causes may be more
tractable in large dog pedigrees than in small, outbred human
families (4, 5). The combination of genetic homogeneity and
phenotypic diversity also provides an opportunity to understand the
genetic basis of many complex developmental processes in mammals (6).

Because of the costs of sequencing mammalian genomes to completion,
these projects have been restricted to a few species that are
considered to be of greatest value to biomedical research. The
decision as to whether future projects should aim for complete
sequence coverage of a few more genomes, or whether the existing
"reference genomes" can be exploited to characterize a wider variety
of genomes that are sequenced to a lower level of coverage, must be
made.


The Dog Genome: Survey 
Sequencing and Comparative 
Analysis 

Ewen F. Kirkness, Vineet Bafna, Aaron L. Halpern,
Samuel Levy, Karin Remington, Douglas B. Rusch,
Arthur L. Delcher, Mihai Pop, Wei Wang, Claire M. Fraser,
J. Craig Venter

A survey of the dog genome sequence (6.22 million sequence reads; 1.5X
coverage) demonstrates the power of sample sequencing for comparative
analysis of mammalian genomes and the generation of species-specific
resources. More than 650 million base pairs (>25%) of dog sequence align
uniquely to the human genome, including fragments of putative orthologs for
18,473 of 24,567 annotated human genes. Mutation rates, conserved synteny,
repeat content, and phylogeny can be compared among human, mouse, and
dog. A variety of polymorphic elements are identified that will be valuable for
mapping the genetic basis of diseases and traits in the dog.
Program determines amino acid sequence
from three-dimensional structure

A computer program developed by researchers at California Institute
of Technology can identify amino acid sequences capable of folding
into the three-dimensional structure of a specific protein. The work
could lead to a deeper understanding of the way protein sequence
specifies structure and could also provide a new method for creating
modified proteins with enhanced properties.

Chemistry graduate student Bassil I. Dahiyat and biology professor
and Howard Hughes Medical Institute investigator Stephen L. Mayo used
their computational design algorithm to design a synthetic protein
that mimics the fold of a zinc finger, a naturally occurring
DNA-binding element found in some transcription factors [Science,
278, 82 (1997)]. The natural zinc finger has only 28 residues, but it
contains the three main secondary structures that make up all
proteins (β-sheet, α-helix, and turn elements), plus a zinc ion. The
Caltech researchers' synthetic zinc finger mimics the secondary
structures but does not include a metal ion.

"Using an algorithm to solve the inverse folding problem (that is, to
find sequences that will adopt a given fold), Dahiyat and Mayo have
taken protein engineering to a new high," comments George D. Rose,
professor of biophysics and biophysical chemistry at Johns Hopkins
University.

"Creation of the de novo zincless finger involved authentic
engineering: design, implementation, and confirmation that the
product satisfied its design specs," Rose points out. "The successful
execution of all three steps in this process has been a long time in
coming." Rose and coworkers have developed one of the most successful
algorithms for solving the opposite problem: determining the 3-D
structure of a protein from its amino acid sequence.

Biochemistry professor William F. DeGrado of the University of
Pennsylvania School of Medicine, who specializes in protein
structure, folding, and design, writes in the same issue of Science
that the zinc-finger mimic "is the smallest protein known to be
capable of folding into a unique structure without the thermodynamic
assistance of disulfides, metal ions, or other subunits. This
important accomplishment illustrates the impressive ability of
Dahiyat and Mayo's program to design highly optimized sequences."

The "spectacular success" achieved by Dahiyat and Mayo, DeGrado
notes, suggests that "the rules of protein folding and computational
methods for de novo design may now be sufficiently defined to allow
the engineering of a variety of proteins."

DeGrado tells C&EN that the work also "brings us closer to the design
of entirely nonnatural biomimetic polymers with predetermined 3-D
structures. There are many components to Mayo's potential energy
function, but the most important of them are based on atom-level
interactions that are not specific to proteins. Hence, the program
could be generalized to aid in the construction of a variety of
polymers that fold in predictable geometries."

A protein with 28 amino acids can have 1.9 × 10^27 different possible
sequences. The task of the program devised by Dahiyat and Mayo was to
select the optimal sequence from among these possibilities. The
algorithm's underlying strategy is to repeatedly find and eliminate
sequences that are a bad fit to the target structure so it can
rapidly converge on a solution. The program assesses interactions
between each side chain and the protein backbone, as well as
interactions between side chains, and it uses a mathematical
proposition called the dead-end elimination theorem to make its
calculations computationally manageable.

The sequence selected by the program was found experimentally to
adopt a structure that closely approximates that of a zinc finger,
even though the mimic's amino acid sequence bears little resemblance
to that of a zinc finger or that of any other known protein. Only six
of the mimic's amino acid residues are identical to those of the
natural protein; another five are similar, and the other 17 are
completely different.

Structure of zinc-finger mimic synthesized by Dahiyat and Mayo [...]
that of natural zinc finger protein. Purple sphere is zinc ion.
Reprinted with permission from Science, copyright 1997 AAAS

The zinc-finger mimic does not reproduce the natural protein's
DNA-binding function. "Our goal was not to retain the function but
just to retain the fold," says Mayo. "We are attempting, however, to
design zinc-finger variants that do, in fact, still bind to DNA and
perhaps change the DNA-binding specificity of the molecule, which
would be very exciting."

The program's predictive abilities have so far been demonstrated only
for the zinc finger. But the researchers believe the algorithm will
prove capable of calculating sequences for a range of other proteins
as well. Dahiyat has now moved to Xencor, Pasadena, Calif., a company
founded to commercialize the new technology.

The protein-folding problem (how structure is derived from sequence)
is among the most significant unsolved issues in chemistry and
biology. "What we've been able to do is obviously not to solve the
protein-folding problem," says Mayo, "but to make a large inroad into
solving the inverse folding problem, which is, given some target
structure, how do you design a sequence that will fold to that
structure?" This knowledge could lead to a better understanding of
the underlying physical chemistry of protein folding.

In addition, the technology should be useful for protein design and
modification.
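
The dead-end elimination step mentioned above can be sketched on a toy problem. The code below applies the original single-rotamer elimination criterion from the dead-end elimination literature: a candidate at one position is discarded when its best possible energy is still worse than the worst possible energy of some alternative at the same position. Whether the program described in the article uses this form or a later refinement is not stated there. The energy tables, the function names, and the random test problem are all invented for illustration.

```python
import itertools
import random

def pair(E_pair, i, r, j, s):
    """Pair energy between candidate r at position i and candidate s at position j."""
    return E_pair[(i, j)][r][s] if i < j else E_pair[(j, i)][s][r]

def dead_end_elimination(E_self, E_pair):
    """Return, per position, the set of candidates that survive elimination."""
    n = len(E_self)
    alive = [set(range(len(E_self[i]))) for i in range(n)]
    changed = True
    while changed:                      # repeat until a full pass removes nothing
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                # Best case for r: most favorable partner at every other position.
                best_r = E_self[i][r] + sum(
                    min(pair(E_pair, i, r, j, s) for s in alive[j])
                    for j in range(n) if j != i)
                for t in alive[i] - {r}:
                    # Worst case for t: least favorable partner everywhere.
                    worst_t = E_self[i][t] + sum(
                        max(pair(E_pair, i, t, j, s) for s in alive[j])
                        for j in range(n) if j != i)
                    if best_r > worst_t:        # r can never beat t: dead end
                        alive[i].discard(r)
                        changed = True
                        break
    return alive

def total_energy(E_self, E_pair, choice):
    """Sum of self energies and all pair energies for one full assignment."""
    n = len(choice)
    e = sum(E_self[i][choice[i]] for i in range(n))
    e += sum(E_pair[(i, j)][choice[i]][choice[j]]
             for i in range(n) for j in range(i + 1, n))
    return e

def brute_force_minimum(E_self, E_pair):
    """Exhaustive search, feasible only for tiny problems; used as a check."""
    sizes = [range(len(row)) for row in E_self]
    return min(itertools.product(*sizes),
               key=lambda choice: total_energy(E_self, E_pair, choice))

def random_problem(n_pos, n_cand, seed):
    """Invented energies: wide self-energy range, narrow pair-energy range."""
    rng = random.Random(seed)
    E_self = [[rng.uniform(-5, 5) for _ in range(n_cand)] for _ in range(n_pos)]
    E_pair = {(i, j): [[rng.uniform(-1, 1) for _ in range(n_cand)]
                       for _ in range(n_cand)]
              for i in range(n_pos) for j in range(i + 1, n_pos)}
    return E_self, E_pair
```

The point of the criterion is that it prunes candidates without ever enumerating full sequences, yet it cannot discard any candidate that belongs to the lowest-energy assignment. The exhaustive search is included only to check that property on problems small enough to enumerate.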
[...] rail problems continue to mount

[...] losses and [...] workers have raised chemical and plastics
manufacturers' [...] to Union Pacific (UP) Railroad to a deafening
roar. Last year's merger of Southern Pacific with UP has led to
service disruptions throughout the western U.S. and the Gulf Coast.
With much chemical traffic having no alternative to rail, UP's
problems are having a devastating effect on business, especially in
the chemical industry-rich Gulf Coast (C&EN, Sept. 1, page 17).



Rail cars stranded In Houston . 

The Chemical Manufecturers Assodar 
tion (CMA) now^estimates losses at more 
than $100 million and growr^ at $36 mil- 
lion monthly for chemical producers in 
UP service areas. 

"The railroads can make or break the 
plastics and chemical industry." says So- 
ciety of the Plastics Industry (SPD Presi- 
dent Larry L Thomas.; "And for that rea- 
son SPI and CMA have no alternative but 
to take aggressive action to seek inmiedi- 
ate relief.* 

In a joint statement released Oct 1, GMA 
and SPI stater "We do not befieve that UP 
atone can scrfvc the problems it has creat- 
ed We continue to urge the U.S- Surface 
Transportation Board [STB] to intervene in 
this growing trahsportadon crisis.". 

At CMA's board of directors meeting in Williamsburg, Va., last month, UP
Chairman and Chief Executive Officer Richard K. Davidson explained the
company's plan for alleviating the situation to 50 chemical industry
executives. Additional details were included in UP's quarterly merger report
filed with STB, a division of the Department of Transportation.

Actions highlighted in the UP plan include temporary diversions of some rail
traffic to other railroads and rerouting trains around congested terminals,
including Houston. UP estimates that Gulf Coast service should return to
normal in 60 to 90 days under this plan.

But CMA and SPI are not convinced. Both organizations have rejected UP's
latest plan to relieve the congestion and are talking to other railroads,
including Kansas City Southern and Burlington Northern Santa Fe, about
increasing their operations in the Gulf Coast.

Several parties have called for STB to issue an emergency services order,
forcing UP to temporarily turn over parts of its operations to the STB or
other carriers.

CMA and SPI say they will continue to pressure UP and STB. CMA is surveying
its full membership to compile a report on company losses due to lost or
slowed production, alternate shipping methods, and idled equipment. The
initial $36 million estimate in monthly losses reflects the experience of
only 12 of CMA's 191 member companies, so that figure is expected to rise.

SPI is focusing on downstream plastics processors to gauge the losses for
businesses that are experiencing delayed shipments from their chemical
suppliers.

The congestion problems are only part of more pervasive problems at UP.
After a series of fatal accidents this summer, the Federal Railroad
Administration (FRA) began a safety review resulting in strong
recommendations for changes in UP procedures. In recent weeks, UP appointed
a new vice president for safety and risk management at FRA's suggestion and
dismissed its vice president of operations.

Discussions of UP's problems continue. On Oct. 3, the Railroad Commission of
Texas, the state that produces the largest volume of chemicals, began a
series of hearings on the rail situation in Houston. Additional sessions are
scheduled for Fort Worth, San Antonio, and El Paso in coming weeks, and
chemical shippers are expected to be major participants.

Paige Morse




Climate change debate

With December negotiations on a global climate change treaty in Kyoto,
Japan, just two months away, more countries are revealing their goals for
reducing greenhouse gas emissions, and claims and counterclaims on what it
would cost to achieve those goals are fogging the air.

At the Kyoto meeting, Japan is planning to propose an overall reduction of
6.5% in emissions below 1990 levels by 2010. That's far less than the
European Union wants. The EU announced last spring that it backs a 15%
rollback in emissions by industrialized countries by 2010. The U.S.'s
[position] is likely to disappoint both. It is now considering three
options: stabilization of emissions at 1990 levels by 2010, by 2020, or by
2030.

The U.S.'s first option is feasible, and achieving it won't ruin the
economy, concludes the Department of Energy (DOE) in a just-released,
peer-reviewed study. If no strong efforts are made to cut back on the use of
fossil fuels, the DOE's Energy Information Administration projects that U.S.
carbon dioxide emissions (measured as carbon) will rise to 1.73 billion
metric tons in 2010, up from the 1990 level of 1.34 billion metric tons.
The study, conducted by five DOE labs, says the [illegible] costs of
stabilizing [greenhouse] gases by investing in technologies [such as]
natural gas-fueled turbines for generating electricity, vehicles that are
more fuel efficient, and energy-saving appliances and building materials.

"This analysis shows that what's good for the environment also can be good
for the economy," says Energy Secretary Federico F. Peña.

Specifically, the DOE study looks at energy technologies and a domestic
system of carbon dioxide emissions trading (similar to the sulfur dioxide
trading system currently in place to minimize acid rain) that could be used
to stabilize carbon dioxide emissions. The costs of the policies examined
range from $50 billion to $90 billion per year, and the energy cost savings
range from $70 billion to $90 billion annually. Consequently, the net
economic cost would be near zero.

"If you design a climate policy to allow time and flexibility, and if you
adopt a technology development program, then we can use the ingenuity of our
scientists [illegible] to achieve more than people [illegible]," says Joseph




